Documentation for EA - Part Two of Two.
EA is an editor-assembler.
Copyright 1984 by Lew Lasher.

Overview of assembler features:

EA is an assembler for the Commodore 64. An assembler is a program used to translate machine language programs from person-readable form (the source program) to a machine-usable form (the object program). The assembler allows you to write, maintain, and modify a large, complicated machine language program.

Operation:

EA requires the source file to be put in one or more text files on a floppy disk. After you have created the source files, and written them to disk (using the C= W command), you can run the assembler by typing C= A.

EA asks you for the name of the source file. If you type "blurfo" for the name, EA first looks for a file named "blurfo". If it cannot find "blurfo", then it looks for "blurfo.a". If it cannot find either one, it prints an error message and returns to the editor. If it finds either one, it creates an object file called "blurfo.o". The filename you type must not be longer than 14 characters, to make room for the extension ".o".

Then EA asks you if you want to make a list file. You should type "y" or "n". The list file, which in the current example would be named "blurfo.l", lists the numeric value of every byte in the object file, alongside the address at which the byte will be loaded and the source line corresponding to that byte. List files are useful in debugging programs.

All these files must be on a disk drive with device number 8 and, if a dual drive, drive number 0.

After optionally creating the list file, EA begins the first pass. Unless there is an error on the disk drive, you will not see anything on the screen during the first pass. No output is written to either the object file or the optional list file during the first pass.
The only purpose of the first pass is to figure out the addresses where the program will be located, and thereby assign numeric values to each label and symbol in your program (see below for more about symbols and labels).

After the first pass is completed, EA prints the message "Beginning 2nd pass" and does just that. During the 2nd pass, you should see on the screen a listing of your source program, but with various numbers over in the left margin, and various error messages preceding the erroneous source lines. If you chose to make a list file, the text you see on the screen will be the text you get in the list file. Meanwhile the object program is being written to the object file.

Finally you will see the message "Errors detected: 0" which indicates that EA did not detect any of the errors in your program. Note that any errors that EA detects are, in a sense, "mere warnings", in that (except for I/O errors) you are still given an object file which can be loaded and run.

At any point in either the first or second pass, you may cut off the assembly by hitting the RUN/STOP key. You should delete the incomplete object and list files which will have been created.

After EA exits, you can load and run the object file by typing:

    load "blurfo.o",8,1
    sys n

where n is the address of the start of your program.

Miscellaneous specifications:

EA will not work with disk drives other than a Commodore 1541 or an MSD SD-2. If changes are made to the ROMs in these models, it may not even work with them.

Differences from other assemblers:

There are numerous differences between EA and other assemblers for the C-64. Many of these are "added features" of EA which will not interfere with the assembly of programs written for another assembler. The following are the major differences which must be taken into account to modify a program written for another assembler:

1. Every label must be followed by a colon (":").

2. The directive * = n is not supported.
In other assemblers, this directive is used primarily for two purposes. The first purpose is to establish the address for the first byte in the object program. For that purpose, replace:

    * = 2049

with:

    .origin 2049

The second common purpose is to allocate an area of memory without specifying the contents of that memory. For that purpose, replace:

    * = * + 1

with:

    .blkb 1

3. Commodore's assembler has a directive called ".byt" which can be used to generate both numeric and text data. In EA, this directive is called ".byte", and only generates numeric data, while the .ascii directive generates text data. Therefore, replace:

    .byt 15, 'Text string', 0

with:

    .byte 15
    .ascii 'Text string'
    .byte 0

4. EA lacks the macro directives, conditional assembly directives, and other specialized directives found in other assemblers.

5. EA does not require or allow the ".end" directive which many assemblers require at the end of a source file. Actually, EA ignores the .end as an unrecognized directive, but you should delete the .end to avoid the error message.

Format of source file:

The source file should be a sequential file, with each line terminated by a RETURN character. Lines are limited to 255 characters, not including the RETURN character. Blank lines are legal, and spaces can be put in liberally to improve readability (exceptions: spaces may not be put in the middle of a numeral or an identifier). Upper and lower case letters are treated identically except in .ascii or .asciz directives and in character literals.

A semicolon (";") is used to indicate the beginning of a comment. All text after the semicolon, and the semicolon itself, is ignored (exceptions: text strings in .ascii and .asciz directives and character literals).

A valid line can be any of the following:

1. A blank line.

2. A symbol definition, e.g.:

    cursordown = 17

3. An instruction for the 6510 microprocessor, e.g.:

    lda # 0

4. An assembler directive, e.g.:

    .byte 5

5. Any of the above (1-4) preceded by one or more labels, e.g.:

    start: startloop: lda # 0
    length: .byte 0
    blankline: ; comment

Blank lines:

Blank lines are useful for the following purposes: to make the source or list file more readable, to hold a label, and to hold comments. I apologize to people who would not consider a line having a label and/or a comment to be a "blank line".

Symbol definitions:

A symbol definition is used to define a name to represent a numeric value. One use of symbols is to refer to addresses not located within the object file, for example, KERNAL routines, page zero, the special addresses used by the video, sound, and I/O chips, and vectors:

    chrout = 65490
    current.key = 197
    color.memory = $D800
    ADSR1 = 54277
    cia1.data.port.A = $dc00
    irq.vector = 788

Symbols can be used for various constants in your program, to make the program easier to understand and to modify, e.g.:

    space.char = 32
    disk.devicenum = 8
    pointer = 251

Sometimes you may use a symbol to refer to a memory location within the object file. In general, labels are preferred over symbols for this purpose. However, note that the following are very similar in effect:

    location:        ; label
    location = *     ; symbol

The difference is that a symbol can be redefined many times within the program without error, while a label is supposed to refer to only one location. The one instance in which it makes sense to use a symbol is to mark the beginning of a loop:

    loop = *
        ; body of loop
        dex
        bne loop

Since the assembler uses the most recently assigned value of the symbol, this pattern can be used for many loops, without having to think of distinct names for the start of each loop. But note that this pattern will fail miserably for a forward branch, or for a loop within a loop.

Because the first pass of the assembler is devoted to defining labels and symbols, you can refer to a symbol or label that is defined later in the source file. This is called a "forward reference".
More precisely, you are allowed one level of forward reference. For example:

    a = b + 40
    b = 1024

is legal, but:

    a = b
    b = c
    c = 100

is not legal.

You should not, however, use even a single-level forward reference in a .blkb or .blkw directive. (These directives are described below.) The assembler uses these directives in the first pass to figure out how much memory to allocate. If a symbol used in a .blkb or .blkw directive is undefined on the first pass, then all subsequent labels will be defined incorrectly.

Symbols that refer to page zero should be defined before they are used. Since instructions referring to page zero usually take 2 bytes, as opposed to 3 bytes for addresses on any other page, the assembler needs to know during the first pass whether an instruction refers to page zero or not. If the address is specified by an undefined symbol, the assembler assumes that the address is NOT on page zero. If this assumption proves incorrect, all subsequently defined labels will be defined incorrectly. Therefore, all symbols referring to page zero MUST be defined before the instructions that refer to the symbols.

Since it is an error to redefine a label, it is also an error to define a symbol with the same name as a label.

Identifiers:

An "identifier" is the name used for a symbol or label. An identifier must start with a letter, and can have letters, numeric digits, or periods for subsequent characters. The length of an identifier is not limited other than by the maximum length (255 characters) of a source line. Upper and lower case letters are treated identically in identifiers. The names A, X, and Y should not be used, even though legal, because other assemblers forbid their use. This assembler allows their use, except that a lone A as an operand to a 6510 instruction will be mistaken for a reference to the 6510's accumulator.
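
The identifier rule can be sketched in Python as a regular expression. This is only an illustration and not EA's own code: the names IDENTIFIER, is_identifier, and same_identifier are hypothetical, and "letter" is assumed here to mean A-Z in either case.

```python
import re

# Assumption: a "letter" is A-Z or a-z. The rule modelled is:
# first character a letter, then any mix of letters, digits, and periods.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9.]*")


def is_identifier(text: str) -> bool:
    """Return True if the whole of text fits the identifier rule."""
    return IDENTIFIER.fullmatch(text) is not None


def same_identifier(a: str, b: str) -> bool:
    """Upper and lower case letters are treated identically."""
    return a.lower() == b.lower()


print(is_identifier("color.memory"))   # True
print(is_identifier(".a.b.c"))         # False: starts with a period
print(same_identifier("Loop", "LOOP")) # True
```

The sketch does not model the 255-character line limit or the special treatment of a lone A operand.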
Examples of legal identifiers:      and illegal identifiers:

    Q                                   8
    A3333333                            3AAAAAAA
    a.b.c                               .a.b.c

Machine instructions:

(After all, the whole point of an assembler is to let you enter machine instructions.)

EA recognizes the standard 56 instruction mnemonics, which you may type in either upper or lower case. EA uses the standard punctuation for the various addressing modes:

    clc             ; implied
    adc # 1         ; immediate
    asl A           ; accumulator
    ldx Z           ; zero page
    ora Z,X         ; zero page, X
    stx Z,Y         ; zero page, Y
    cpx E           ; absolute
    ldy E,X         ; absolute, X
    ldx E,Y         ; absolute, Y
    beq E           ; relative
    eor (Z,X)       ; (indirect,X)
    sbc (Z),Y       ; (indirect),Y
    jmp (E)         ; absolute indirect

where Z is an expression whose value is between 0-255, and E is an expression whose value is not necessarily between 0-255. (The exact definition of an "expression" will be given later.) Note that it is often not the punctuation, but the value of the expression, which distinguishes certain addressing modes.

You may, but need not, put in spaces before or after a "(", ")", ",", or "#". You must put in a space after the name of the instruction if the operand starts with a letter.

Expressions:

An expression is a bunch of text that represents a numeric value. The following are legal expressions:

1. A decimal numeral (0 to 65535):

    1024

2. A hexadecimal numeral, preceded by a dollar sign:

    $ff
    $DC00

3. A binary numeral, preceded by a percent sign:

    %10000001

4. A character literal, preceded by an apostrophe:

    'A

5. An identifier:

    loop
    color.memory
    disk.devicenum

The numeric value of an identifier is given by its most recent definition as a symbol or as a label.

6. The "location counter", either a period or asterisk:

    .
    *

The numeric value of the location counter is the address in the object program corresponding to the beginning of the current source line.

7. An expression preceded by a "unary operator":

    < E     ; low-order byte of E
    > E     ; high-order byte of E
    - E     ; negative E (two's complement)
    ? E     ; logical (one's) complement of E

8. An expression followed by a "binary operator" followed by a second expression:

    E1 + E2     ; addition
    E1 - E2     ; subtraction
    E1 * E2     ; multiplication
    E1 / E2     ; integer division
    E1 % E2     ; modulo (remainder after integer division)
    E1 & E2     ; logical AND
    E1 ! E2     ; logical OR
    E1 ^ E2     ; arithmetic shift
                ; left shift if E2 is positive
                ; right shift if E2 is negative

If more than one operator is used in the same expression, they are done in the following order:

    ?  - (negative)  *  %  /  - (subtraction)  +  &  !  ^  <  >

If the same operator appears twice, evaluation is from right to left. You can override the normal order of evaluation by using brackets ("[" and "]") as parentheses.

Note, however, that all these arithmetic operations may be done only on constant expressions. That is, you cannot use the above operators to add the contents of various memory locations. The usefulness of elaborate assembler expressions is primarily to define one symbol in terms of another, so that if you change one symbol, various other symbols are automatically redefined.

Assembler directives:

The assembler directives perform various miscellaneous functions:

    .byte
    .byte Z
    .byte Z1, Z2, ... , Z3

The .byte directive allocates one or more bytes in the object file. If an expression is given, the expression is put in as the initial value of the byte. If more than one expression is given, separated by commas, then a byte is allocated for each expression. If no expression is given, a single byte is allocated with an initial value of zero. Only the low-order byte of the expression is used.

    .word
    .word E
    .word E1, E2, ... , E3

The .word directive is very similar to the .byte directive, but two bytes are allocated for each expression, the first for the low-order byte and the second for the high-order byte. If no expression is given, then two bytes are allocated, both initialized to zero.
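
The byte layout described for .byte and .word can be sketched in Python. This is an illustration under stated assumptions, not EA's implementation: encode_byte and encode_word are hypothetical names, and each takes an already evaluated expression value.

```python
def encode_byte(value: int) -> bytes:
    """Model of .byte: only the low-order byte of the value is kept."""
    return bytes([value & 0xFF])


def encode_word(value: int) -> bytes:
    """Model of .word: low-order byte first, then high-order byte."""
    low = value & 0xFF
    high = (value >> 8) & 0xFF
    return bytes([low, high])


print(encode_byte(0x1234).hex())  # 34
print(encode_word(0xDC00).hex())  # 00dc
```

With no expression given, the same model applies with a value of zero, so encode_word(0) yields two zero bytes.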
Note that .word E is exactly equivalent to:

    .byte < E, > E

    .blkb
    .blkb E

The .blkb directive is used to allocate memory in the object program without specifying the initial values. If an expression is given, that number of bytes is allocated. If no expression is given, 1 byte is allocated. But if the value of the expression is zero, no bytes are allocated. If the expression is negative or greater than 32767, an error message is generated and no bytes are allocated.

    .blkw
    .blkw E

The .blkw directive is similar to the .blkb directive, except that the number of bytes to be allocated is multiplied by two.

    .ascii "Text string"

The .ascii directive is used to allocate memory in the object program for a text string. One byte is allocated for each character in the text string, and each byte is initialized with the character code (not the code for screen memory) for the character from the source line. The text string can be delimited by any character not found within the text string, with the exception that a semicolon (";"), a space, or a RETURN character may not be used as the delimiter. Note that the character codes are not true ASCII but the Commodore variant, sometimes called PETSCII.

    .asciz /Text string/

The .asciz directive is similar to the .ascii directive, except that a zero byte is allocated after the text string.

    .origin E

The .origin directive is used at the beginning of the source file to specify the address to be used for the first byte in the object file. In fact, the .origin directive need not be literally on the first line of the source file, but it must come before any source lines that allocate memory in the object file. If you do not include a .origin directive, the default origin is 49152 ($C000 hexadecimal).

    .print E text

The .print directive is used to print messages on the terminal and into the optional list file. The expression following the .print is evaluated.
The numeric value of that expression is printed, in decimal, to the screen and into the optional list file, followed by any other text on the source line. The main use of the .print directive is to produce a memory map rather than a full listing of every line of the program. The output from the .print directive is performed even if the .nolist directive is in effect. Examples:

        .print subroutine subroutine
    subroutine:
        ...
        ...
        rts
        .print * - subroutine, length of subroutine
        .print * End of file
        .print * - start, Size of file

    .nolist

The .nolist directive turns off the listing of every source line to the screen and to the optional list file. The output of the .print directive is still performed, however.

    .list

The .list directive turns back on the listing of every source line, after a previous .nolist directive. In this way, you can make a list file of only those particular routines which you are interested in debugging.

    .include identifier

The .include directive is used to incorporate another source file as though it had been merged with the principal source file. This allows you to have a library of common subroutines without having to make a separate copy of the subroutines in each program in which you use them. It is also useful to make several, slightly different versions of the same program. It is also useful to divide up a large program for more convenient editing. It is also useful for incorporating a list of common symbol definitions, such as the addresses of the KERNAL routines.

The assembler first looks for a file whose name is given by the identifier (converted to lower case). If that file cannot be found, it adds ".a" to the filename and looks for that name. After the assembler has finished processing the included source file, it returns to the principal source file and continues from the line following the .include directive. The included file can include a "third-level" source file, but there is a maximum of four levels of inclusion.
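
The file lookup described for .include can be sketched in Python as follows. This is a minimal model and not EA's code: resolve_include is a hypothetical name, the disk directory is represented as a plain set of filenames, and the four-level nesting limit is not modelled.

```python
from typing import Optional, Set


def resolve_include(identifier: str, files_on_disk: Set[str]) -> Optional[str]:
    """Return the filename an .include would open, or None if neither exists.

    The identifier is converted to lower case, the bare name is tried
    first, and the name with ".a" appended is tried second.
    """
    name = identifier.lower()
    for candidate in (name, name + ".a"):
        if candidate in files_on_disk:
            return candidate
    return None


disk = {"kernal.a", "macros"}          # hypothetical disk contents
print(resolve_include("KERNAL", disk)) # kernal.a
print(resolve_include("macros", disk)) # macros
print(resolve_include("sound", disk))  # None
```

A None result corresponds to the case where the assembler cannot open the include file.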
Error messages:

File name may be 1 to 14 chars long
    If, when the assembler initially asks for the name of the source file, you hit the RETURN key with no name, or give a name longer than 14 characters, you get this message.

Couldn't create object file:
Couldn't create list file:
Error reading source file:
Error writing to object file:
Error writing to list file:
Error closing object file:
Error closing list file:
Error re-reading source file
    Also prints the error message given by the disk drive.

Couldn't open control or random channel
    May indicate that you ran EA while file channels were still opened from another program.

Line too long
    Source line longer than 255 characters.

Assembly terminated by STOP key

Unrecognized statement
    The first non-blank character in the source line was neither a letter nor a period. May indicate that the source line contains a non-printing character.

Unrecognized opcode
    May indicate a misspelled opcode, or the accidental omission of the colon after a label or of the = following a symbol definition.

Extra input ignored
    May indicate the omission of the semicolon preceding a comment. May also indicate the omission of an operator from an expression.

No character after '
    Indicates that an apostrophe was the last character in the source line.

Malformed expression

Warning - extra [

Warning - extra ]

Warning - arithmetic overflow
    Indicates an arithmetic operation whose result required more than 16 bits.

Warning - division by zero

Internal assembler error
    This message indicates that the assembler was not able to evaluate an expression properly.

Second argument to ^ must be -15 to 15

Undefined symbol: identifier

Symbol table full
    Indicates that the assembler was not able to create a label or symbol because of insufficient available memory.

Warning - redefining label:
    This may indicate that the source file has two different labels with the same name, or a symbol and a label with the same name.
    It also may indicate a duplicate name from an include file. If, however, you get this error message for every label in the source file, it means that the value of the labels changed from the first pass to the second pass. This, in turn, usually means that a symbol was used in a .blkb or .blkw directive before it was defined, or that a symbol referring to page zero was used before it was defined.

Unrecognized addressing mode

Improper addressing mode for opcode
    Not all the possible addressing modes can be used for any particular opcode. See the C-64 Programmer's Reference Manual, or any other book about the 6502, for a description of the permissible addressing modes for each opcode.

Expression truncated
    An expression used with an opcode in immediate mode or used in a .byte directive was not within the range -256 to 255.

Branch out of range
    The branch instructions for the 6510 take a signed one-byte offset, so they may only be used to branch to a location no more than 126 bytes before the branch instruction or no more than 129 bytes after it. This error message is also often given if the expression specifying the address uses an undefined identifier.

Unrecognized directive
    A source line began with a period, but was not one of the assembler directives described above.

Warning - extra comma
    A .byte or .word directive had a trailing comma after the last expression.

Warning - comma inserted
    A .byte or .word directive had two expressions not separated by a comma.

Argument must be positive
    The argument to .blkb or to .blkw was negative, or greater than 32767.

Missing string
    An .ascii or .asciz directive did not have a text string, or used a semicolon as the delimiter.

Unterminated string
    An .ascii or .asciz directive was followed by a text string which was not followed by the delimiter character preceding the text string.

Origin can only be set once at start
    The source file contained two .origin directives, or contained an .origin directive after a source line which allocated memory.
    May indicate that an include file had an .origin directive.

Filename must start with a letter
    A .include directive was not followed by an identifier.

Already at maximum depth of includes
    Only four levels of source files are allowed.

Couldn't open include file:
    Prints the error message from the disk drive.

File number not opened
File number out of range
File number already in use
Directory not updated
    These messages indicate internal errors in the file-handling routines used by the assembler.

For more information, call Lew Lasher at (617) 547-0340